Multi-objective Topic Modeling

Authors

  • Osama Khalifa
  • David W. Corne
  • Mike J. Chantler
  • Fraser Halley
Abstract

Topic Modeling (TM) is a rapidly growing area at the interface of text mining, artificial intelligence and statistical modeling, and it is increasingly deployed to address the 'information overload' associated with extensive text repositories. The goal in TM is typically to infer a rich yet intuitive summary model of a large document collection: a specific set of topics that characterizes the collection (each topic being a probability distribution over words), along with the degree to which each individual document is concerned with each topic. The model then supports segmentation, clustering, profiling, browsing, and many other tasks. Current approaches to TM, dominated by Latent Dirichlet Allocation (LDA), assume a topic-driven document generation process and find a model that maximizes the likelihood of the data with respect to this process. This is clearly sensitive to any mismatch between the 'true' generating process and the statistical model, while it is also clear that the quality of a topic model is multi-faceted and complex. Individual topics should be intuitively meaningful, sensibly distinct, and free of noise. Here we investigate multi-objective approaches to topic modeling, which attempt to infer coherent topic models by navigating the tradeoffs between objectives that are oriented towards coherence as well as coverage of the corpus at hand. Comparisons with LDA show that adopting multi-objective evolutionary algorithm (MOEA) approaches yields significantly more coherent topics than LDA, consequently enhancing the use and interpretability of these models in a range of applications, without any significant degradation in the models' generalization ability.
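The idea of navigating tradeoffs between objectives can be illustrated with Pareto dominance, the standard comparison used in multi-objective optimization. The sketch below is only an illustration and is not the authors' method: the abstract does not say how coherence or coverage are computed, so the score pairs are invented, and the names `dominates` and `pareto_front` are hypothetical helpers. Both objectives are assumed to be maximized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is no worse than `b` on every objective and
    strictly better on at least one (all objectives are maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Invented (coherence, coverage) scores for four candidate topic models.
# These numbers are placeholders, not results from the paper.
candidates = [(0.62, 0.40), (0.55, 0.70), (0.50, 0.60), (0.30, 0.90)]

front = pareto_front(candidates)
print(front)  # indices of the candidates that represent distinct tradeoffs
```

Here the third candidate is worse than the second on both objectives, so it drops out; the remaining three each represent a different balance between coherence and coverage, and none can be called better than the others without a preference between the two objectives.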


Related resources

Measuring a Dynamic Efficiency Based on MONLP Model under DEA Control

Data envelopment analysis (DEA) is a common technique in measuring the relative efficiency of a set of decision making units (DMUs) with multiple inputs and multiple outputs. Standard DEA models are quite limited models, in the sense that they do not consider a DMU at different times. To resolve this problem, DEA models with dynamic structures have been proposed. In a recent pape...


Multi-objective Modeling Based on Competition Airlines Cooperation by Game Theory and Sustainable Development Approach

In each time period, passenger demand for each route is finite and airlines compete to earn more profit. The complex competition among airlines causes problems, such as complicating flight planning and increasing the number of empty seats on some routes. These problems increase air pollution and fuel consumption. To solve these problems, this research studies the cooperation of the airlines wi...


Tutorial on Probabilistic Topic Modeling: Additive Regularization for Stochastic Matrix Factorization

Probabilistic topic modeling of text collections is a powerful tool for statistical text analysis. In this tutorial we introduce a novel non-Bayesian approach, called Additive Regularization of Topic Models. ARTM is free of redundant probabilistic assumptions and provides a simple inference for many combined and multi-objective topic models.


Modeling and Multi-Objective Optimization of Stall Control on NACA0015 Airfoil with a Synthetic Jet using GMDH Type Neural Networks and Genetic Algorithms

This study concerns numerical simulation, modeling and optimization of aerodynamic stall control using a synthetic jet actuator. The numerical simulation was carried out by a large-eddy simulation that employs an RNG-based model as the subgrid-scale model. The flow around a NACA0015 airfoil, including a synthetic jet located at 10% of the chord, is studied under Reynolds number Re = 12.7 × 10^6 a...


Application of Genetic Algorithm to Determine Kinetic Parameters of Free Radical Polymerization of Vinyl Acetate by Multi-objective Optimization Technique

A multi-objective optimization procedure based on a genetic algorithm has been developed to determine some kinetic parameters of the free radical polymerization of vinyl acetate. For this purpose, mathematical modeling of the free radical polymerization of vinyl acetate is carried out first, and then selected kinetic parameters are optimized by minimizing objective functions defined from comparing exp...




Publication date: 2013